Overview

Dataset statistics

Number of variables: 13
Number of observations: 2973
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 290.5 KiB
Average record size in memory: 100.0 B

Variable types

Numeric: 13

Alerts

gross_revenue is highly correlated with qty_invoices and 4 other fields [High correlation]
recency_days is highly correlated with qty_invoices [High correlation]
qty_invoices is highly correlated with gross_revenue and 3 other fields [High correlation]
qty_itemns is highly correlated with gross_revenue and 4 other fields [High correlation]
qty_products is highly correlated with gross_revenue and 2 other fields [High correlation]
avg_recency_days is highly correlated with frequency [High correlation]
frequency is highly correlated with avg_recency_days [High correlation]
avg_basket_size is highly correlated with gross_revenue and 2 other fields [High correlation]
avg_unique_basket_size is highly correlated with gross_revenue and 2 other fields [High correlation]
gross_revenue is highly correlated with qty_invoices and 1 other field [High correlation]
qty_invoices is highly correlated with gross_revenue and 2 other fields [High correlation]
qty_itemns is highly correlated with gross_revenue and 1 other field [High correlation]
qty_products is highly correlated with qty_invoices [High correlation]
avg_ticket is highly correlated with qty_returns and 2 other fields [High correlation]
qty_returns is highly correlated with avg_ticket and 2 other fields [High correlation]
avg_basket_size is highly correlated with avg_ticket and 2 other fields [High correlation]
avg_unique_basket_size is highly correlated with avg_ticket and 2 other fields [High correlation]
gross_revenue is highly correlated with qty_itemns and 1 other field [High correlation]
qty_invoices is highly correlated with qty_itemns [High correlation]
qty_itemns is highly correlated with gross_revenue and 4 other fields [High correlation]
qty_products is highly correlated with gross_revenue and 1 other field [High correlation]
avg_recency_days is highly correlated with frequency and 1 other field [High correlation]
frequency is highly correlated with avg_recency_days [High correlation]
qty_returns is highly correlated with avg_recency_days [High correlation]
avg_basket_size is highly correlated with qty_itemns and 1 other field [High correlation]
avg_unique_basket_size is highly correlated with qty_itemns and 1 other field [High correlation]
gross_revenue is highly correlated with qty_invoices and 6 other fields [High correlation]
qty_invoices is highly correlated with gross_revenue and 2 other fields [High correlation]
qty_itemns is highly correlated with gross_revenue and 6 other fields [High correlation]
qty_products is highly correlated with gross_revenue and 2 other fields [High correlation]
avg_ticket is highly correlated with gross_revenue and 4 other fields [High correlation]
qty_returns is highly correlated with gross_revenue and 4 other fields [High correlation]
avg_basket_size is highly correlated with gross_revenue and 4 other fields [High correlation]
avg_unique_basket_size is highly correlated with gross_revenue and 4 other fields [High correlation]
avg_ticket is highly skewed (γ1 = 53.48019807) [Skewed]
frequency is highly skewed (γ1 = 24.89613265) [Skewed]
qty_returns is highly skewed (γ1 = 51.83258039) [Skewed]
avg_basket_size is highly skewed (γ1 = 44.70180648) [Skewed]
avg_unique_basket_size is highly skewed (γ1 = 44.70180648) [Skewed]
df_index has unique values [Unique]
customer_id has unique values [Unique]
recency_days has 34 (1.1%) zeros [Zeros]
qty_returns has 1484 (49.9%) zeros [Zeros]
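
The skewness alerts report γ1 for each flagged variable, and the zeros alerts report the share of zero entries. As a rough illustration, the sketch below computes a bias-corrected sample skewness and a zero share in plain Python. The cutoff of 20 is an assumption inferred from this report (variables with γ1 between roughly 16 and 18 are not flagged); the function names are hypothetical and are not part of pandas-profiling.

```python
import math

# Assumed cutoff, inferred from which variables this report flags as skewed.
SKEW_THRESHOLD = 20.0


def sample_skewness(values):
    """Adjusted Fisher-Pearson sample skewness (bias-corrected g1)."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least 3 values")
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    if m2 == 0:
        return 0.0  # constant data has no asymmetry
    g1 = m3 / m2 ** 1.5
    return math.sqrt(n * (n - 1)) / (n - 2) * g1


def is_highly_skewed(gamma1, threshold=SKEW_THRESHOLD):
    """Flag a variable whose absolute skewness exceeds the cutoff."""
    return abs(gamma1) > threshold


def zero_share(values):
    """Fraction of entries that are exactly zero."""
    return sum(1 for v in values if v == 0) / len(values)


# Values taken from the report: avg_ticket is flagged, gross_revenue is not.
print(is_highly_skewed(53.48019807), is_highly_skewed(16.78841992))
# Zero share for qty_returns, from the counts in the alert (1484 of 2973).
print(round(100 * 1484 / 2973, 1))
```

With the counts from the alert list, 1484 zeros out of 2973 observations gives the 49.9% shown for qty_returns.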

Reproduction

Analysis started: 2022-05-26 19:36:34.228851
Analysis finished: 2022-05-26 19:38:24.592546
Duration: 1 minute and 50.36 seconds
Software version: pandas-profiling v3.2.0
Download configuration: config.json
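
A report of this kind is typically produced by passing a DataFrame to pandas-profiling's `ProfileReport` and writing the result to HTML. The following is a minimal sketch, assuming pandas-profiling v3.2.0 is installed and the 13-column customer table is already loaded as a DataFrame. The helper name, title, and output path are placeholders; the settings actually used for this run are those in config.json, which is not reproduced here.

```python
def build_profile_report(df, output_path="report.html"):
    """Write an HTML profile for a pandas DataFrame (hypothetical helper)."""
    # Imported lazily so this module still loads where pandas-profiling is absent.
    from pandas_profiling import ProfileReport

    profile = ProfileReport(df, title="Profiling Report")  # placeholder title
    profile.to_file(output_path)
    return output_path
```

Calling `build_profile_report(df)` on the customer table would write `report.html` in the working directory.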

Variables

df_index
Real number (ℝ≥0)

UNIQUE

Distinct: 2973
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2320.551295
Minimum: 0
Maximum: 5725
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 23.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 185.6
Q1: 930
median: 2122
Q3: 3541
95-th percentile: 5042.8
Maximum: 5725
Range: 5725
Interquartile range (IQR): 2611

Descriptive statistics

Standard deviation: 1557.043184
Coefficient of variation (CV): 0.6709798604
Kurtosis: -1.009966708
Mean: 2320.551295
Median Absolute Deviation (MAD): 1272
Skewness: 0.3424128767
Sum: 6898999
Variance: 2424383.477
Monotonicity: Strictly increasing
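
Several of the descriptive statistics are derived from the others: the mean is the sum divided by the number of observations, the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the IQR and range are differences of quantiles. The short check below recomputes them from the df_index figures listed here; it is only a consistency sketch on the rounded values shown, not the profiler's own code.

```python
import math

# Values copied from the df_index summary.
n_obs = 2973
total = 6898999
mean = 2320.551295
std = 1557.043184
q1, q3 = 930, 3541
minimum, maximum = 0, 5725

derived = {
    "mean": total / n_obs,       # sum / number of observations
    "cv": std / mean,            # coefficient of variation
    "variance": std ** 2,        # squared standard deviation
    "iqr": q3 - q1,              # interquartile range
    "range": maximum - minimum,
}
reported = {
    "mean": 2320.551295,
    "cv": 0.6709798604,
    "variance": 2424383.477,
    "iqr": 2611,
    "range": 5725,
}

for name, value in derived.items():
    ok = math.isclose(value, reported[name], rel_tol=1e-6)
    print(f"{name}: derived={value:.6f} reported={reported[name]} match={ok}")
```

The same relations can be used to sanity-check the summaries of the other twelve variables.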
Histogram with fixed size bins (bins=50)

Common values
Value                 Count   Frequency (%)
0                     1       < 0.1%
3014                  1       < 0.1%
3001                  1       < 0.1%
3002                  1       < 0.1%
3003                  1       < 0.1%
3006                  1       < 0.1%
3008                  1       < 0.1%
3009                  1       < 0.1%
3011                  1       < 0.1%
3012                  1       < 0.1%
Other values (2963)   2963    99.7%

Minimum 10 values
Value   Count   Frequency (%)
0       1       < 0.1%
1       1       < 0.1%
2       1       < 0.1%
3       1       < 0.1%
4       1       < 0.1%
5       1       < 0.1%
6       1       < 0.1%
7       1       < 0.1%
8       1       < 0.1%
9       1       < 0.1%

Maximum 10 values
Value   Count   Frequency (%)
5725    1       < 0.1%
5706    1       < 0.1%
5696    1       < 0.1%
5690    1       < 0.1%
5669    1       < 0.1%
5665    1       < 0.1%
5659    1       < 0.1%
5648    1       < 0.1%
5647    1       < 0.1%
5637    1       < 0.1%

customer_id
Real number (ℝ≥0)

UNIQUE

Distinct: 2973
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15269.75479
Minimum: 12347
Maximum: 18287
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.7 KiB

Quantile statistics

Minimum: 12347
5-th percentile: 12617.8
Q1: 13799
median: 15220
Q3: 16767
95-th percentile: 17964.4
Maximum: 18287
Range: 5940
Interquartile range (IQR): 2968

Descriptive statistics

Standard deviation: 1718.870036
Coefficient of variation (CV): 0.1125669704
Kurtosis: -1.205091
Mean: 15269.75479
Median Absolute Deviation (MAD): 1486
Skewness: 0.03172900873
Sum: 45396981
Variance: 2954514.2
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value                 Count   Frequency (%)
17850                 1       < 0.1%
16185                 1       < 0.1%
14626                 1       < 0.1%
14868                 1       < 0.1%
18246                 1       < 0.1%
17115                 1       < 0.1%
16611                 1       < 0.1%
15912                 1       < 0.1%
12670                 1       < 0.1%
17588                 1       < 0.1%
Other values (2963)   2963    99.7%

Minimum 10 values
Value   Count   Frequency (%)
12347   1       < 0.1%
12348   1       < 0.1%
12352   1       < 0.1%
12356   1       < 0.1%
12358   1       < 0.1%
12359   1       < 0.1%
12360   1       < 0.1%
12362   1       < 0.1%
12363   1       < 0.1%
12364   1       < 0.1%

Maximum 10 values
Value   Count   Frequency (%)
18287   1       < 0.1%
18283   1       < 0.1%
18282   1       < 0.1%
18277   1       < 0.1%
18276   1       < 0.1%
18274   1       < 0.1%
18273   1       < 0.1%
18272   1       < 0.1%
18270   1       < 0.1%
18269   1       < 0.1%

gross_revenue
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 2958
Distinct (%): 99.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2746.725883
Minimum: 6.2
Maximum: 279138.02
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.4 KiB

Quantile statistics

Minimum: 6.2
5-th percentile: 229.92
Q1: 570.5
median: 1084.1
Q3: 2306.52
95-th percentile: 7211.22
Maximum: 279138.02
Range: 279131.82
Interquartile range (IQR): 1736.02

Descriptive statistics

Standard deviation: 10573.74237
Coefficient of variation (CV): 3.849580488
Kurtosis: 354.4086525
Mean: 2746.725883
Median Absolute Deviation (MAD): 668.9
Skewness: 16.78841992
Sum: 8166016.05
Variance: 111804027.6
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value                 Count   Frequency (%)
2053.02               2       0.1%
331                   2       0.1%
1078.96               2       0.1%
1025.44               2       0.1%
598.2                 2       0.1%
533.33                2       0.1%
731.9                 2       0.1%
379.65                2       0.1%
2092.32               2       0.1%
745.06                2       0.1%
Other values (2948)   2953    99.3%

Minimum 10 values
Value   Count   Frequency (%)
6.2     1       < 0.1%
13.3    1       < 0.1%
15      1       < 0.1%
36.56   1       < 0.1%
45      1       < 0.1%
52      1       < 0.1%
52.2    1       < 0.1%
52.2    1       < 0.1%
62.43   1       < 0.1%
68.84   1       < 0.1%

Maximum 10 values
Value       Count   Frequency (%)
279138.02   1       < 0.1%
259657.3    1       < 0.1%
194550.79   1       < 0.1%
168472.5    1       < 0.1%
140450.72   1       < 0.1%
124564.53   1       < 0.1%
117379.63   1       < 0.1%
91062.38    1       < 0.1%
72882.09    1       < 0.1%
66653.56    1       < 0.1%

recency_days
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 272
Distinct (%): 9.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 64.37975109
Minimum: 0
Maximum: 373
Zeros: 34
Zeros (%): 1.1%
Negative: 0
Negative (%): 0.0%
Memory size: 23.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2
Q1: 11
median: 31
Q3: 81
95-th percentile: 242
Maximum: 373
Range: 373
Interquartile range (IQR): 70

Descriptive statistics

Standard deviation: 77.75088394
Coefficient of variation (CV): 1.20769159
Kurtosis: 2.765034542
Mean: 64.37975109
Median Absolute Deviation (MAD): 26
Skewness: 1.794156168
Sum: 191401
Variance: 6045.199953
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value                Count   Frequency (%)
1                    99      3.3%
4                    87      2.9%
2                    85      2.9%
3                    85      2.9%
8                    76      2.6%
10                   67      2.3%
7                    66      2.2%
9                    66      2.2%
17                   64      2.2%
22                   55      1.8%
Other values (262)   2223    74.8%

Minimum 10 values
Value   Count   Frequency (%)
0       34      1.1%
1       99      3.3%
2       85      2.9%
3       85      2.9%
4       87      2.9%
5       43      1.4%
7       66      2.2%
8       76      2.6%
9       66      2.2%
10      67      2.3%

Maximum 10 values
Value   Count   Frequency (%)
373     2       0.1%
372     4       0.1%
371     1       < 0.1%
368     1       < 0.1%
366     4       0.1%
365     2       0.1%
364     1       < 0.1%
360     1       < 0.1%
359     1       < 0.1%
358     4       0.1%
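
Each frequency table lists the ten most common values and pools every remaining distinct value into a single "Other values (k)" row. The sketch below shows one way such a table could be built in plain Python; the function name and the `top_n` parameter are illustrative assumptions, not the profiler's API.

```python
from collections import Counter


def frequency_table(values, top_n=10):
    """Return (label, count, percent) rows for the most common values,
    plus an 'Other values (k)' row pooling the remaining k distinct values."""
    counts = Counter(values)
    total = len(values)
    rows = [
        (value, count, 100.0 * count / total)
        for value, count in counts.most_common(top_n)
    ]
    shown = sum(count for _, count, _ in rows)
    n_other = len(counts) - len(rows)  # distinct values not listed individually
    if n_other > 0:
        rest = total - shown
        rows.append((f"Other values ({n_other})", rest, 100.0 * rest / total))
    return rows


# Toy input for illustration only.
for label, count, pct in frequency_table([1, 1, 1, 2, 2, 3, 4], top_n=2):
    print(label, count, f"{pct:.1f}%")
```

For recency_days, the ten listed counts sum to 750, so the remaining 2223 of 2973 observations (74.8%) fall into the 262 other distinct values, matching the 272 distinct values reported for the variable.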

qty_invoices
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 56
Distinct (%): 1.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.718466196
Minimum: 1
Maximum: 206
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
median: 4
Q3: 6
95-th percentile: 17
Maximum: 206
Range: 205
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 8.851499485
Coefficient of variation (CV): 1.547880005
Kurtosis: 191.0363992
Mean: 5.718466196
Median Absolute Deviation (MAD): 2
Skewness: 10.77215129
Sum: 17001
Variance: 78.34904314
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value               Count   Frequency (%)
2                   788     26.5%
3                   500     16.8%
4                   393     13.2%
5                   237     8.0%
1                   190     6.4%
6                   173     5.8%
7                   138     4.6%
8                   98      3.3%
9                   69      2.3%
10                  55      1.8%
Other values (46)   332     11.2%

Minimum 10 values
Value   Count   Frequency (%)
1       190     6.4%
2       788     26.5%
3       500     16.8%
4       393     13.2%
5       237     8.0%
6       173     5.8%
7       138     4.6%
8       98      3.3%
9       69      2.3%
10      55      1.8%

Maximum 10 values
Value   Count   Frequency (%)
206     1       < 0.1%
199     1       < 0.1%
124     1       < 0.1%
97      1       < 0.1%
91      2       0.1%
86      1       < 0.1%
72      1       < 0.1%
62      2       0.1%
60      1       < 0.1%
57      1       < 0.1%

qty_itemns
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 1671
Distinct (%): 56.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1607.347124
Minimum: 1
Maximum: 196844
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 102.6
Q1: 297
median: 639
Q3: 1399
95-th percentile: 4406.6
Maximum: 196844
Range: 196843
Interquartile range (IQR): 1102

Descriptive statistics

Standard deviation: 5883.760265
Coefficient of variation (CV): 3.660541134
Kurtosis: 466.6007221
Mean: 1607.347124
Median Absolute Deviation (MAD): 420
Skewness: 17.87003689
Sum: 4778643
Variance: 34618634.85
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value                 Count   Frequency (%)
310                   11      0.4%
150                   9       0.3%
88                    9       0.3%
288                   8       0.3%
260                   8       0.3%
246                   8       0.3%
84                    8       0.3%
272                   8       0.3%
1200                  7       0.2%
330                   7       0.2%
Other values (1661)   2890    97.2%

Minimum 10 values
Value   Count   Frequency (%)
1       1       < 0.1%
2       2       0.1%
12      2       0.1%
16      1       < 0.1%
17      1       < 0.1%
18      1       < 0.1%
19      1       < 0.1%
20      1       < 0.1%
23      1       < 0.1%
25      1       < 0.1%

Maximum 10 values
Value    Count   Frequency (%)
196844   1       < 0.1%
80997    1       < 0.1%
80263    1       < 0.1%
77373    1       < 0.1%
69993    1       < 0.1%
64549    1       < 0.1%
64124    1       < 0.1%
63312    1       < 0.1%
58343    1       < 0.1%
57885    1       < 0.1%

qty_products
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 468
Distinct (%): 15.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 122.6152035
Minimum: 1
Maximum: 7838
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 9
Q1: 29
median: 67
Q3: 135
95-th percentile: 382
Maximum: 7838
Range: 7837
Interquartile range (IQR): 106

Descriptive statistics

Standard deviation: 269.7316209
Coefficient of variation (CV): 2.199821989
Kurtosis: 355.2780943
Mean: 122.6152035
Median Absolute Deviation (MAD): 44
Skewness: 15.71641959
Sum: 364535
Variance: 72755.14731
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value                Count   Frequency (%)
28                   43      1.4%
20                   37      1.2%
35                   35      1.2%
29                   35      1.2%
19                   34      1.1%
15                   33      1.1%
11                   32      1.1%
26                   31      1.0%
27                   30      1.0%
25                   30      1.0%
Other values (458)   2633    88.6%

Minimum 10 values
Value   Count   Frequency (%)
1       6       0.2%
2       14      0.5%
3       16      0.5%
4       17      0.6%
5       26      0.9%
6       29      1.0%
7       18      0.6%
8       19      0.6%
9       26      0.9%
10      28      0.9%

Maximum 10 values
Value   Count   Frequency (%)
7838    1       < 0.1%
5673    1       < 0.1%
5095    1       < 0.1%
4580    1       < 0.1%
2698    1       < 0.1%
2379    1       < 0.1%
2060    1       < 0.1%
1818    1       < 0.1%
1673    1       < 0.1%
1637    1       < 0.1%

avg_ticket
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
SKEWED

Distinct: 2970
Distinct (%): 99.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 51.85483961
Minimum: 2.150588235
Maximum: 56157.5
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.4 KiB

Quantile statistics

Minimum: 2.150588235
5-th percentile: 4.919753553
Q1: 13.12129032
median: 17.97438356
Q3: 24.97962963
95-th percentile: 90.49133333
Maximum: 56157.5
Range: 56155.34941
Interquartile range (IQR): 11.85833931

Descriptive statistics

Standard deviation: 1036.237034
Coefficient of variation (CV): 19.9834199
Kurtosis: 2894.600567
Mean: 51.85483961
Median Absolute Deviation (MAD): 5.977291363
Skewness: 53.48019807
Sum: 154164.4382
Variance: 1073787.19
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value                 Count   Frequency (%)
14.47833333           2       0.1%
15                    2       0.1%
4.162                 2       0.1%
15.94088235           1       < 0.1%
22.8792623            1       < 0.1%
20.51104167           1       < 0.1%
149.025               1       < 0.1%
21.47435897           1       < 0.1%
12.949                1       < 0.1%
13.92736842           1       < 0.1%
Other values (2960)   2960    99.6%

Minimum 10 values
Value         Count   Frequency (%)
2.150588235   1       < 0.1%
2.4325        1       < 0.1%
2.462371134   1       < 0.1%
2.511241379   1       < 0.1%
2.515333333   1       < 0.1%
2.65          1       < 0.1%
2.656931818   1       < 0.1%
2.707598253   1       < 0.1%
2.760621572   1       < 0.1%
2.770464191   1       < 0.1%

Maximum 10 values
Value         Count   Frequency (%)
56157.5       1       < 0.1%
4453.43       1       < 0.1%
3202.92       1       < 0.1%
1687.2        1       < 0.1%
952.9875      1       < 0.1%
872.13        1       < 0.1%
841.0214493   1       < 0.1%
651.1683333   1       < 0.1%
640           1       < 0.1%
624.4         1       < 0.1%

avg_recency_days
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION

Distinct: 1258
Distinct (%): 42.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 67.33918498
Minimum: 1
Maximum: 366
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 8
Q1: 25.92857143
median: 48.25
Q3: 85.33333333
95-th percentile: 201
Maximum: 366
Range: 365
Interquartile range (IQR): 59.4047619

Descriptive statistics

Standard deviation: 63.52149965
Coefficient of variation (CV): 0.9433066299
Kurtosis: 4.889771368
Mean: 67.33918498
Median Absolute Deviation (MAD): 26.25
Skewness: 2.062955526
Sum: 200199.397
Variance: 4034.980917
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value                 Count   Frequency (%)
14                    25      0.8%
4                     22      0.7%
70                    21      0.7%
7                     20      0.7%
35                    19      0.6%
49                    18      0.6%
46                    17      0.6%
21                    17      0.6%
11                    17      0.6%
6                     16      0.5%
Other values (1248)   2781    93.5%

Minimum 10 values
Value         Count   Frequency (%)
1             16      0.5%
1.5           1       < 0.1%
2             13      0.4%
2.5           1       < 0.1%
2.601398601   1       < 0.1%
3             15      0.5%
3.321428571   1       < 0.1%
3.330357143   1       < 0.1%
3.5           2       0.1%
4             22      0.7%

Maximum 10 values
Value   Count   Frequency (%)
366     1       < 0.1%
365     1       < 0.1%
363     1       < 0.1%
362     1       < 0.1%
357     2       0.1%
356     1       < 0.1%
355     2       0.1%
352     1       < 0.1%
351     2       0.1%
350     3       0.1%

frequency
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
SKEWED

Distinct: 1225
Distinct (%): 41.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.1136936504
Minimum: 0.005449591281
Maximum: 17
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.4 KiB

Quantile statistics

Minimum: 0.005449591281
5-th percentile: 0.008896801846
Q1: 0.01633986928
median: 0.02590673575
Q3: 0.04945054945
95-th percentile: 1
Maximum: 17
Range: 16.99455041
Interquartile range (IQR): 0.03311068017

Descriptive statistics

Standard deviation: 0.4078915909
Coefficient of variation (CV): 3.587637388
Kurtosis: 990.6267355
Mean: 0.1136936504
Median Absolute Deviation (MAD): 0.01218850234
Skewness: 24.89613265
Sum: 338.0112225
Variance: 0.1663755499
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value                 Count   Frequency (%)
1                     198     6.7%
0.0625                18      0.6%
0.02777777778         17      0.6%
0.02380952381         16      0.5%
0.03448275862         15      0.5%
0.09090909091         15      0.5%
0.08333333333         15      0.5%
0.02941176471         14      0.5%
0.03571428571         13      0.4%
0.07692307692         13      0.4%
Other values (1215)   2639    88.8%

Minimum 10 values
Value            Count   Frequency (%)
0.005449591281   1       < 0.1%
0.005464480874   1       < 0.1%
0.005479452055   1       < 0.1%
0.005494505495   1       < 0.1%
0.005586592179   2       0.1%
0.005602240896   1       < 0.1%
0.005617977528   2       0.1%
0.00566572238    1       < 0.1%
0.005681818182   2       0.1%
0.005698005698   3       0.1%

Maximum 10 values
Value          Count   Frequency (%)
17             1       < 0.1%
3              1       < 0.1%
2              6       0.2%
1.142857143    1       < 0.1%
1              198     6.7%
0.75           1       < 0.1%
0.6666666667   3       0.1%
0.550802139    1       < 0.1%
0.5335120643   1       < 0.1%
0.5            3       0.1%

qty_returns
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
SKEWED
ZEROS

Distinct: 214
Distinct (%): 7.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 62.07399933
Minimum: 0
Maximum: 80995
Zeros: 1484
Zeros (%): 49.9%
Negative: 0
Negative (%): 0.0%
Memory size: 23.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 1
Q3: 9
95-th percentile: 100.4
Maximum: 80995
Range: 80995
Interquartile range (IQR): 9

Descriptive statistics

Standard deviation: 1511.479652
Coefficient of variation (CV): 24.34964185
Kurtosis: 2769.251267
Mean: 62.07399933
Median Absolute Deviation (MAD): 1
Skewness: 51.83258039
Sum: 184546
Variance: 2284570.738
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value                Count   Frequency (%)
0                    1484    49.9%
1                    164     5.5%
2                    149     5.0%
3                    105     3.5%
4                    89      3.0%
6                    78      2.6%
5                    61      2.1%
12                   51      1.7%
8                    43      1.4%
7                    43      1.4%
Other values (204)   706     23.7%

Minimum 10 values
Value   Count   Frequency (%)
0       1484    49.9%
1       164     5.5%
2       149     5.0%
3       105     3.5%
4       89      3.0%
5       61      2.1%
6       78      2.6%
7       43      1.4%
8       43      1.4%
9       41      1.4%

Maximum 10 values
Value   Count   Frequency (%)
80995   1       < 0.1%
9014    1       < 0.1%
8004    1       < 0.1%
4427    1       < 0.1%
3768    1       < 0.1%
3332    1       < 0.1%
2878    1       < 0.1%
2022    1       < 0.1%
2012    1       < 0.1%
1776    1       < 0.1%

avg_basket_size
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
SKEWED

Distinct: 1980
Distinct (%): 66.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 249.779818
Minimum: 1
Maximum: 40498.5
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 44
Q1: 103.3
median: 172.3333333
Q3: 281.6923077
95-th percentile: 600
Maximum: 40498.5
Range: 40497.5
Interquartile range (IQR): 178.3923077

Descriptive statistics

Standard deviation: 791.0287722
Coefficient of variation (CV): 3.16690427
Kurtosis: 2258.51001
Mean: 249.779818
Median Absolute Deviation (MAD): 83
Skewness: 44.70180648
Sum: 742595.399
Variance: 625726.5185
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value                 Count   Frequency (%)
100                   11      0.4%
114                   10      0.3%
82                    9       0.3%
73                    9       0.3%
86                    9       0.3%
60                    8       0.3%
88                    8       0.3%
136                   8       0.3%
75                    8       0.3%
288                   7       0.2%
Other values (1970)   2886    97.1%

Minimum 10 values
Value         Count   Frequency (%)
1             2       0.1%
2             1       < 0.1%
3.333333333   1       < 0.1%
5.333333333   1       < 0.1%
5.666666667   1       < 0.1%
6.142857143   1       < 0.1%
7.5           1       < 0.1%
9             1       < 0.1%
9.5           1       < 0.1%
11            1       < 0.1%

Maximum 10 values
Value         Count   Frequency (%)
40498.5       1       < 0.1%
6009.333333   1       < 0.1%
4282          1       < 0.1%
3906          1       < 0.1%
3868.65       1       < 0.1%
2880          1       < 0.1%
2801          1       < 0.1%
2733.944444   1       < 0.1%
2518.769231   1       < 0.1%
2160.333333   1       < 0.1%

avg_unique_basket_size
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
SKEWED

Distinct: 1980
Distinct (%): 66.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 249.779818
Minimum: 1
Maximum: 40498.5
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 44
Q1: 103.3
median: 172.3333333
Q3: 281.6923077
95-th percentile: 600
Maximum: 40498.5
Range: 40497.5
Interquartile range (IQR): 178.3923077

Descriptive statistics

Standard deviation: 791.0287722
Coefficient of variation (CV): 3.16690427
Kurtosis: 2258.51001
Mean: 249.779818
Median Absolute Deviation (MAD): 83
Skewness: 44.70180648
Sum: 742595.399
Variance: 625726.5185
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value                 Count   Frequency (%)
100                   11      0.4%
114                   10      0.3%
82                    9       0.3%
73                    9       0.3%
86                    9       0.3%
60                    8       0.3%
88                    8       0.3%
136                   8       0.3%
75                    8       0.3%
288                   7       0.2%
Other values (1970)   2886    97.1%

Minimum 10 values
Value         Count   Frequency (%)
1             2       0.1%
2             1       < 0.1%
3.333333333   1       < 0.1%
5.333333333   1       < 0.1%
5.666666667   1       < 0.1%
6.142857143   1       < 0.1%
7.5           1       < 0.1%
9             1       < 0.1%
9.5           1       < 0.1%
11            1       < 0.1%

Maximum 10 values
Value         Count   Frequency (%)
40498.5       1       < 0.1%
6009.333333   1       < 0.1%
4282          1       < 0.1%
3906          1       < 0.1%
3868.65       1       < 0.1%
2880          1       < 0.1%
2801          1       < 0.1%
2733.944444   1       < 0.1%
2518.769231   1       < 0.1%
2160.333333   1       < 0.1%
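
In this report every statistic and frequency table listed for avg_unique_basket_size matches avg_basket_size exactly, which may mean the two columns hold the same values. One way to check is sketched below in plain Python, using a hypothetical mapping of column names to value lists; the underlying data is not part of this report, so the toy values are for illustration only.

```python
from itertools import combinations


def identical_columns(columns):
    """Return pairs of column names whose value lists are element-wise equal."""
    return [
        (name_a, name_b)
        for (name_a, vals_a), (name_b, vals_b) in combinations(columns.items(), 2)
        if len(vals_a) == len(vals_b)
        and all(x == y for x, y in zip(vals_a, vals_b))
    ]


# Toy data for illustration only, not taken from the report's dataset.
toy = {
    "avg_basket_size": [1.0, 172.5, 40498.5],
    "avg_unique_basket_size": [1.0, 172.5, 40498.5],
    "qty_products": [1, 67, 7838],
}
print(identical_columns(toy))
```

If the real columns turn out to be identical, one of them adds no information and only inflates the correlation alerts.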

Interactions

[Interaction plots for pairs of variables; images not preserved in this extract]
2022-05-26T16:36:56.630852image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-05-26T16:37:01.846148image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-05-26T16:37:07.702253image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-05-26T16:37:14.076818image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-05-26T16:37:20.013063image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-05-26T16:37:26.643780image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-05-26T16:37:33.181185image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-05-26T16:37:40.454091image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-05-26T16:37:49.469774image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-05-26T16:38:17.502424image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-05-26T16:36:40.813391image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-05-26T16:36:46.565126image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-05-26T16:36:51.527406image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-05-26T16:36:57.046378image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-05-26T16:37:02.256409image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-05-26T16:37:08.181015image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-05-26T16:37:14.573298image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-05-26T16:37:20.500819image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-05-26T16:37:27.139911image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-05-26T16:37:33.691960image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-05-26T16:37:40.960822image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-05-26T16:37:50.613110image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
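
The calculation described above can be sketched in plain Python. This is a minimal illustration, not the code the report itself runs: the function names `average_ranks` and `spearman_rho` and the toy data are assumptions made for the example, and ties are given the mean of their rank positions.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Covariance of the ranks divided by the product of the rank standard deviations."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_rx, mean_ry = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_rx) * (b - mean_ry) for a, b in zip(rx, ry)) / n
    std_rx = sqrt(sum((a - mean_rx) ** 2 for a in rx) / n)
    std_ry = sqrt(sum((b - mean_ry) ** 2 for b in ry) / n)
    return cov / (std_rx * std_ry)


# Toy data (hypothetical): y grows monotonically but not linearly with x,
# so rho comes out at (numerically close to) 1.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho = spearman_rho(x, y)
```

Because only the ranks enter the formula, any strictly increasing transformation of either variable leaves ρ unchanged.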

Pearson's r

Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the slope of the line (its angle to the x-axis) does not affect the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
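
As a rough sketch of that formula, the snippet below computes r in plain Python and checks the location and scale invariance mentioned above. The function name `pearson_r` and the sample values are hypothetical, chosen only for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)


# Toy data (hypothetical values, not taken from the report).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r_original = pearson_r(x, y)
# Shifting and positively rescaling y leaves r unchanged (up to rounding).
r_rescaled = pearson_r(x, [3 * v + 7 for v in y])
```

The population form (dividing by n) is used for both the covariance and the standard deviations; using n - 1 throughout gives the same r because the factors cancel.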

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
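
The pair counting can be sketched as follows. This implements the formula exactly as stated (often called τ-a), under the assumption that tied pairs count as neither concordant nor discordant; libraries commonly apply a tie-corrected variant instead, so results on data with ties may differ. The function name `kendall_tau_a` is hypothetical.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total pairs, with ties counted as neither."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1  # both variables move in the same direction
        elif direction < 0:
            discordant += 1  # the variables move in opposite directions
    n = len(x)
    total_pairs = n * (n - 1) / 2
    return (concordant - discordant) / total_pairs


# Toy data (hypothetical): two concordant pairs and one discordant pair,
# so tau = (2 - 1) / 3.
tau = kendall_tau_a([1, 2, 3], [1, 3, 2])
```

This direct version compares every pair, so its cost grows quadratically with the number of observations.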

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.

Sample

First rows

      df_index  customer_id  gross_revenue  recency_days  qty_invoices  qty_itemns  qty_products  avg_ticket  avg_recency_days  frequency  qty_returns  avg_basket_size  avg_unique_basket_size
   0         0        17850        5391.21         372.0          34.0      1733.0         297.0   18.152222         35.500000  17.000000         40.0        50.970588               50.970588
   1         1        13047        3232.59          56.0           9.0      1390.0         171.0   18.904035         27.250000   0.028302         35.0       154.444444              154.444444
   2         2        12583        6705.38           2.0          15.0      5028.0         232.0   28.902500         23.187500   0.040323         50.0       335.200000              335.200000
   3         3        13748         948.25          95.0           5.0       439.0          28.0   33.866071         92.666667   0.017921          0.0        87.800000               87.800000
   4         4        15100         876.00         333.0           3.0        80.0           3.0  292.000000          8.600000   0.073171         22.0        26.666667               26.666667
   5         5        15291        4623.30          25.0          14.0      2102.0         102.0   45.326471         23.200000   0.040115         29.0       150.142857              150.142857
   6         6        14688        5630.87           7.0          21.0      3621.0         327.0   17.219786         18.300000   0.057221        399.0       172.428571              172.428571
   7         7        17809        5411.91          16.0          12.0      2057.0          61.0   88.719836         35.700000   0.033520         41.0       171.416667              171.416667
   8         8        15311       60767.90           0.0          91.0     38194.0        2379.0   25.543464          4.144444   0.243316        474.0       419.714286              419.714286
   9         9        16098        2005.63          87.0           7.0       613.0          67.0   29.934776         47.666667   0.024390          0.0        87.571429               87.571429

Last rows

      df_index  customer_id  gross_revenue  recency_days  qty_invoices  qty_itemns  qty_products  avg_ticket  avg_recency_days  frequency  qty_returns  avg_basket_size  avg_unique_basket_size
2963      5637        17727        1060.25          15.0           1.0       645.0          66.0   16.064394               6.0   1.000000          6.0       645.000000              645.000000
2964      5647        17232         421.52           2.0           2.0       203.0          36.0   11.708889              12.0   0.153846          0.0       101.500000              101.500000
2965      5648        17468         137.00          10.0           2.0       116.0           5.0   27.400000               4.0   0.400000          0.0        58.000000               58.000000
2966      5659        13596         697.04           5.0           2.0       406.0         166.0    4.199036               7.0   0.250000          0.0       203.000000              203.000000
2967      5665        14893        1237.85           9.0           2.0       799.0          73.0   16.956849               2.0   0.666667          0.0       399.500000              399.500000
2968      5669        12479         473.20          11.0           1.0       382.0          30.0   15.773333               4.0   1.000000         34.0       382.000000              382.000000
2969      5690        14126         706.13           7.0           3.0       508.0          15.0   47.075333               3.0   0.750000         50.0       169.333333              169.333333
2970      5696        13521        1092.39           1.0           3.0       733.0         435.0    2.511241               4.5   0.300000          0.0       244.333333              244.333333
2971      5706        15060         301.84           8.0           4.0       262.0         120.0    2.515333               1.0   2.000000          0.0        65.500000               65.500000
2972      5725        12558         269.96           7.0           1.0       196.0          11.0   24.541818               6.0   1.000000        196.0       196.000000              196.000000